Overview

Dataset statistics

Number of variables: 15
Number of observations: 18333
Missing cells: 18333
Missing cells (%): 6.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 98.0 B
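
The missing-cell figures above can be cross-checked against the per-column missing counts listed under Warnings (sold: 11673, first_date: 6660). The sketch below is only a sanity check; it assumes the percentage is missing cells divided by all cells (variables × observations), which is an assumption about how the report derives it.

```python
# Figures copied from the report. The cell-level formula is an assumption
# about how "Missing cells (%)" is derived.
n_variables = 15
n_observations = 18333
missing_per_column = {"sold": 11673, "first_date": 6660}

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 18333
print(f"{missing_pct:.1f}%")  # 6.7%
```

Under that assumption the two missing columns account for every missing cell in the dataset.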

Variable types

Numeric: 8
Categorical: 6
Boolean: 1

Warnings

car_type has a high cardinality: 373 distinct values (High cardinality)
first_date has a high cardinality: 214 distinct values (High cardinality)
average_price is highly correlated with price and 1 other field (High correlation)
price is highly correlated with average_price and 1 other field (High correlation)
diff_price is highly correlated with average_price and 2 other fields (High correlation)
diff_price_perc is highly correlated with diff_price (High correlation)
average_price is highly correlated with price (High correlation)
price is highly correlated with average_price and 1 other field (High correlation)
diff_price is highly correlated with price and 1 other field (High correlation)
diff_price_perc is highly correlated with diff_price (High correlation)
sold has 11673 (63.7%) missing values (Missing)
first_date has 6660 (36.3%) missing values (Missing)
car_id is uniformly distributed (Uniform)
car_id has unique values (Unique)
antiquity has 405 (2.2%) zeros (Zeros)

Reproduction

Analysis started: 2021-06-01 04:59:37.805003
Analysis finished: 2021-06-01 04:59:45.918350
Duration: 8.11 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

car_id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 18333
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9997.398189
Minimum: 0
Maximum: 19999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 143.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1015.6
Q1: 5013
Median: 9991
Q3: 14995
95-th percentile: 18988.4
Maximum: 19999
Range: 19999
Interquartile range (IQR): 9982

Descriptive statistics

Standard deviation: 5766.700943
Coefficient of variation (CV): 0.576820172
Kurtosis: -1.199077161
Mean: 9997.398189
Median Absolute Deviation (MAD): 4992
Skewness: -0.000379208247
Sum: 183282301
Variance: 33254839.77
Monotonicity: Not monotonic
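
Several of the statistics reported for car_id are related by simple formulas. As a rough illustration, the sketch below recomputes the range, IQR, standard deviation and coefficient of variation from the other reported values, using the standard textbook definitions. Whether the report uses exactly these definitions internally is an assumption.

```python
import math

# Values copied from the car_id section of the report.
q1, q3 = 5013, 14995
minimum, maximum = 0, 19999
mean = 9997.398189
variance = 33254839.77

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
std = math.sqrt(variance)        # standard deviation
cv = std / mean                  # coefficient of variation

print(iqr, value_range)  # 9982 19999
print(std, cv)
```

The recomputed values agree with the reported standard deviation (5766.700943) and CV (0.576820172) to the precision shown in the report.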
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
9534 | 1 | < 0.1%
1354 | 1 | < 0.1%
7497 | 1 | < 0.1%
5448 | 1 | < 0.1%
19779 | 1 | < 0.1%
17730 | 1 | < 0.1%
11583 | 1 | < 0.1%
15677 | 1 | < 0.1%
13644 | 1 | < 0.1%
Other values (18323) | 18323 | 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%

Value | Count | Frequency (%)
19999 | 1 | < 0.1%
19998 | 1 | < 0.1%
19997 | 1 | < 0.1%
19996 | 1 | < 0.1%
19995 | 1 | < 0.1%
19994 | 1 | < 0.1%
19993 | 1 | < 0.1%
19991 | 1 | < 0.1%
19990 | 1 | < 0.1%
19989 | 1 | < 0.1%

car_type
Categorical

HIGH CARDINALITY

Distinct: 373
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 143.4 KiB

Serie 1 | 170
Ram 1500 V6 | 158
Odyssey | 140
CR-V | 138
March | 135
Other values (368) | 17592

Length

Max length: 25
Median length: 6
Mean length: 6.88610702
Min length: 2

Characters and Unicode

Total characters: 126243
Distinct characters: 65
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
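
To make the idea of per-character Unicode properties concrete, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the sample strings are illustrative choices, not taken from the report's implementation. The two-letter codes correspond to the category names used in this report: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Zs` is Space Separator and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# A few car_type values that appear in the report, used as sample input.
sample = ["Ram 1500 V6", "CR-V", "Odyssey"]
counts = category_counts(sample)
print(dict(counts))
```

The standard library exposes the general category but not the script or block of a character, so this sketch covers only the "categories" part of the section.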

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Supra
2nd row: Supra
3rd row: Supra
4th row: Supra
5th row: Supra

Common Values

Value | Count | Frequency (%)
Serie 1 | 170 | 0.9%
Ram 1500 V6 | 158 | 0.9%
Odyssey | 140 | 0.8%
CR-V | 138 | 0.8%
March | 135 | 0.7%
Serie 7 | 135 | 0.7%
Transit | 134 | 0.7%
Terrain | 132 | 0.7%
Saveiro | 132 | 0.7%
Cherokee | 132 | 0.7%
Other values (363) | 16927 | 92.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
clase924
 
3.7%
serie693
 
2.8%
ram344
 
1.4%
cargo338
 
1.3%
van336
 
1.3%
quattro300
 
1.2%
rover285
 
1.1%
range285
 
1.1%
v6283
 
1.1%
1500263
 
1.0%
Other values (380)21113
83.9%

Most occurring characters

ValueCountFrequency (%)
a11755
 
9.3%
e10239
 
8.1%
r9947
 
7.9%
6904
 
5.5%
o6206
 
4.9%
i5744
 
4.5%
n5294
 
4.2%
t4985
 
3.9%
C4394
 
3.5%
s4007
 
3.2%
Other values (55)56768
45.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter80194
63.5%
Uppercase Letter27798
 
22.0%
Decimal Number10795
 
8.6%
Space Separator6904
 
5.5%
Dash Punctuation552
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a11755
14.7%
e10239
12.8%
r9947
12.4%
o6206
 
7.7%
i5744
 
7.2%
n5294
 
6.6%
t4985
 
6.2%
s4007
 
5.0%
l3014
 
3.8%
u2446
 
3.1%
Other values (17)16557
20.6%
Uppercase Letter
ValueCountFrequency (%)
C4394
15.8%
S3834
13.8%
R2122
 
7.6%
M1768
 
6.4%
T1678
 
6.0%
V1493
 
5.4%
A1442
 
5.2%
X1405
 
5.1%
L1403
 
5.0%
E1257
 
4.5%
Other values (16)7002
25.2%
Decimal Number
ValueCountFrequency (%)
03595
33.3%
51478
13.7%
31034
 
9.6%
1891
 
8.3%
6819
 
7.6%
4798
 
7.4%
8780
 
7.2%
2719
 
6.7%
7572
 
5.3%
9109
 
1.0%
Space Separator
ValueCountFrequency (%)
6904
100.0%
Dash Punctuation
ValueCountFrequency (%)
-552
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin107992
85.5%
Common18251
 
14.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a11755
 
10.9%
e10239
 
9.5%
r9947
 
9.2%
o6206
 
5.7%
i5744
 
5.3%
n5294
 
4.9%
t4985
 
4.6%
C4394
 
4.1%
s4007
 
3.7%
S3834
 
3.6%
Other values (43)41587
38.5%
Common
ValueCountFrequency (%)
6904
37.8%
03595
19.7%
51478
 
8.1%
31034
 
5.7%
1891
 
4.9%
6819
 
4.5%
4798
 
4.4%
8780
 
4.3%
2719
 
3.9%
7572
 
3.1%
Other values (2)661
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII126220
> 99.9%
Latin 1 Sup23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a11755
 
9.3%
e10239
 
8.1%
r9947
 
7.9%
6904
 
5.5%
o6206
 
4.9%
i5744
 
4.6%
n5294
 
4.2%
t4985
 
3.9%
C4394
 
3.5%
s4007
 
3.2%
Other values (54)56745
45.0%
Latin 1 Sup
ValueCountFrequency (%)
ó23
100.0%

color
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size143.4 KiB
Black
6415 
White
4625 
Red
3706 
Blue
3035 
Orange
 
552

Length

Max length6
Median length5
Mean length4.460262914
Min length3

Characters and Unicode

Total characters81770
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBlue
2nd rowBlack
3rd rowBlack
4th rowBlue
5th rowBlack

Common Values

ValueCountFrequency (%)
Black6415
35.0%
White4625
25.2%
Red3706
20.2%
Blue3035
16.6%
Orange552
 
3.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
black6415
35.0%
white4625
25.2%
red3706
20.2%
blue3035
16.6%
orange552
 
3.0%

Most occurring characters

ValueCountFrequency (%)
e11918
14.6%
B9450
11.6%
l9450
11.6%
a6967
8.5%
c6415
7.8%
k6415
7.8%
W4625
 
5.7%
h4625
 
5.7%
i4625
 
5.7%
t4625
 
5.7%
Other values (7)12655
15.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter63437
77.6%
Uppercase Letter18333
 
22.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e11918
18.8%
l9450
14.9%
a6967
11.0%
c6415
10.1%
k6415
10.1%
h4625
 
7.3%
i4625
 
7.3%
t4625
 
7.3%
d3706
 
5.8%
u3035
 
4.8%
Other values (3)1656
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
B9450
51.5%
W4625
25.2%
R3706
 
20.2%
O552
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin81770
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e11918
14.6%
B9450
11.6%
l9450
11.6%
a6967
8.5%
c6415
7.8%
k6415
7.8%
W4625
 
5.7%
h4625
 
5.7%
i4625
 
5.7%
t4625
 
5.7%
Other values (7)12655
15.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII81770
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e11918
14.6%
B9450
11.6%
l9450
11.6%
a6967
8.5%
c6415
7.8%
k6415
7.8%
W4625
 
5.7%
h4625
 
5.7%
i4625
 
5.7%
t4625
 
5.7%
Other values (7)12655
15.5%

km
Real number (ℝ≥0)

Distinct18328
Distinct (%)> 99.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean25285230.58
Minimum1394070
Maximum49452737
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size71.7 KiB

Quantile statistics

Minimum1394070
5-th percentile10621811.8
Q118990785
median25306193
Q331534749
95-th percentile40074969
Maximum49452737
Range48058667
Interquartile range (IQR)12543964

Descriptive statistics

Standard deviation: 8920231.687
Coefficient of variation (CV): 0.352784273
Kurtosis: -0.4563884981
Mean: 25285230.58
Median Absolute Deviation (MAD): 6266302
Skewness: 0.02280199573
Sum: 4.635541321 × 10^11
Variance: 7.957053335 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
261477152
 
< 0.1%
301795282
 
< 0.1%
356753272
 
< 0.1%
283612312
 
< 0.1%
188044422
 
< 0.1%
283915991
 
< 0.1%
271762871
 
< 0.1%
295569171
 
< 0.1%
141140971
 
< 0.1%
202376191
 
< 0.1%
Other values (18318)18318
99.9%
ValueCountFrequency (%)
13940701
< 0.1%
16204261
< 0.1%
16461431
< 0.1%
16608071
< 0.1%
17085901
< 0.1%
18182471
< 0.1%
18596421
< 0.1%
18903641
< 0.1%
18919431
< 0.1%
20451341
< 0.1%
ValueCountFrequency (%)
494527371
< 0.1%
494162011
< 0.1%
494042851
< 0.1%
490611301
< 0.1%
488225021
< 0.1%
487072641
< 0.1%
486957281
< 0.1%
486485191
< 0.1%
486170541
< 0.1%
485704261
< 0.1%

average_price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct700
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean551575.625
Minimum101729
Maximum997705
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size71.7 KiB

Quantile statistics

Minimum101729
5-th percentile149481
Q1330625
median559715
Q3779453
95-th percentile951729
Maximum997705
Range895976
Interquartile range (IQR)448828

Descriptive statistics

Standard deviation: 257173.25
Coefficient of variation (CV): 0.4662520289
Kurtosis: -1.208065629
Mean: 551575.625
Median Absolute Deviation (MAD): 226560
Skewness: -0.02829320543
Sum: 1.011203584 × 10^10
Variance: 6.613808333 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
30781245
 
0.2%
23818443
 
0.2%
13342042
 
0.2%
82025140
 
0.2%
67102639
 
0.2%
46992339
 
0.2%
59839538
 
0.2%
19637238
 
0.2%
65298838
 
0.2%
12190938
 
0.2%
Other values (690)17933
97.8%
ValueCountFrequency (%)
10172930
0.2%
10248621
0.1%
10613226
0.1%
10622729
0.2%
10758322
0.1%
10805632
0.2%
11264319
0.1%
11727727
0.1%
12098527
0.1%
12190938
0.2%
ValueCountFrequency (%)
99770519
0.1%
99710425
0.1%
99706737
0.2%
99348124
0.1%
99217926
0.1%
99216726
0.1%
99087816
0.1%
98896823
0.1%
98671129
0.2%
98607833
0.2%

transmission
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size143.4 KiB
Automatic
14540 
Manual
3793 

Length

Max length9
Median length9
Mean length8.379315988
Min length6

Characters and Unicode

Total characters153618
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowManual
2nd rowManual
3rd rowManual
4th rowManual
5th rowManual

Common Values

ValueCountFrequency (%)
Automatic14540
79.3%
Manual3793
 
20.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
automatic14540
79.3%
manual3793
 
20.7%

Most occurring characters

ValueCountFrequency (%)
t29080
18.9%
a22126
14.4%
u18333
11.9%
A14540
9.5%
o14540
9.5%
m14540
9.5%
i14540
9.5%
c14540
9.5%
M3793
 
2.5%
n3793
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter135285
88.1%
Uppercase Letter18333
 
11.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t29080
21.5%
a22126
16.4%
u18333
13.6%
o14540
10.7%
m14540
10.7%
i14540
10.7%
c14540
10.7%
n3793
 
2.8%
l3793
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
A14540
79.3%
M3793
 
20.7%

Most occurring scripts

ValueCountFrequency (%)
Latin153618
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t29080
18.9%
a22126
14.4%
u18333
11.9%
A14540
9.5%
o14540
9.5%
m14540
9.5%
i14540
9.5%
c14540
9.5%
M3793
 
2.5%
n3793
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII153618
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t29080
18.9%
a22126
14.4%
u18333
11.9%
A14540
9.5%
o14540
9.5%
m14540
9.5%
i14540
9.5%
c14540
9.5%
M3793
 
2.5%
n3793
 
2.5%

body_type
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size143.4 KiB
Sedan
10797 
SUV
4544 
Coupe
2160 
Truck
 
832

Length

Max length5
Median length5
Mean length4.504281896
Min length3

Characters and Unicode

Total characters82577
Distinct characters15
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSedan
2nd rowSedan
3rd rowSedan
4th rowSedan
5th rowSedan

Common Values

ValueCountFrequency (%)
Sedan10797
58.9%
SUV4544
24.8%
Coupe2160
 
11.8%
Truck832
 
4.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
sedan10797
58.9%
suv4544
24.8%
coupe2160
 
11.8%
truck832
 
4.5%

Most occurring characters

ValueCountFrequency (%)
S15341
18.6%
e12957
15.7%
d10797
13.1%
a10797
13.1%
n10797
13.1%
U4544
 
5.5%
V4544
 
5.5%
u2992
 
3.6%
C2160
 
2.6%
o2160
 
2.6%
Other values (5)5488
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter55156
66.8%
Uppercase Letter27421
33.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e12957
23.5%
d10797
19.6%
a10797
19.6%
n10797
19.6%
u2992
 
5.4%
o2160
 
3.9%
p2160
 
3.9%
r832
 
1.5%
c832
 
1.5%
k832
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
S15341
55.9%
U4544
 
16.6%
V4544
 
16.6%
C2160
 
7.9%
T832
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin82577
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S15341
18.6%
e12957
15.7%
d10797
13.1%
a10797
13.1%
n10797
13.1%
U4544
 
5.5%
V4544
 
5.5%
u2992
 
3.6%
C2160
 
2.6%
o2160
 
2.6%
Other values (5)5488
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII82577
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S15341
18.6%
e12957
15.7%
d10797
13.1%
a10797
13.1%
n10797
13.1%
U4544
 
5.5%
V4544
 
5.5%
u2992
 
3.6%
C2160
 
2.6%
o2160
 
2.6%
Other values (5)5488
 
6.6%

DI
Real number (ℝ)

Distinct102
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.716576665
Minimum-1
Maximum131
Zeros0
Zeros (%)0.0%
Negative11673
Negative (%)63.7%
Memory size143.4 KiB

Quantile statistics

Minimum-1
5-th percentile-1
Q1-1
median-1
Q311
95-th percentile43
Maximum131
Range132
Interquartile range (IQR)12

Descriptive statistics

Standard deviation15.57532101
Coefficient of variation (CV)2.018423673
Kurtosis4.214797611
Mean7.716576665
Median Absolute Deviation (MAD)0
Skewness2.051958201
Sum141468
Variance242.5906247
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-1 | 11673 | 63.7%
1 | 235 | 1.3%
3 | 222 | 1.2%
2 | 220 | 1.2%
4 | 203 | 1.1%
5 | 202 | 1.1%
6 | 194 | 1.1%
8 | 189 | 1.0%
10 | 176 | 1.0%
7 | 170 | 0.9%
Other values (92) | 4849 | 26.4%

Value | Count | Frequency (%)
-1 | 11673 | 63.7%
1 | 235 | 1.3%
2 | 220 | 1.2%
3 | 222 | 1.2%
4 | 203 | 1.1%
5 | 202 | 1.1%
6 | 194 | 1.1%
7 | 170 | 0.9%
8 | 189 | 1.0%
9 | 166 | 0.9%

Value | Count | Frequency (%)
131 | 1 | < 0.1%
115 | 1 | < 0.1%
113 | 1 | < 0.1%
111 | 1 | < 0.1%
107 | 1 | < 0.1%
105 | 1 | < 0.1%
101 | 1 | < 0.1%
99 | 1 | < 0.1%
96 | 1 | < 0.1%
95 | 2 | < 0.1%

price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct18152
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean603549.0625
Minimum98972
Maximum1175730
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size71.7 KiB

Quantile statistics

Minimum98972
5-th percentile164036.8
Q1357402
median608859
Q3851566
95-th percentile1041221.4
Maximum1175730
Range1076758
Interquartile range (IQR)494164

Descriptive statistics

Standard deviation: 283352.125
Coefficient of variation (CV): 0.4694765508
Kurtosis: -1.174217105
Mean: 603549.0625
Median Absolute Deviation (MAD): 246789
Skewness: -0.002718279604
Sum: 1.106486477 × 10^10
Variance: 8.028842394 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2001313
 
< 0.1%
8655622
 
< 0.1%
2533012
 
< 0.1%
8324612
 
< 0.1%
9924102
 
< 0.1%
4002422
 
< 0.1%
5520632
 
< 0.1%
2456092
 
< 0.1%
6465952
 
< 0.1%
5809632
 
< 0.1%
Other values (18142)18312
99.9%
ValueCountFrequency (%)
989721
< 0.1%
993991
< 0.1%
1034101
< 0.1%
1034781
< 0.1%
1038651
< 0.1%
1039221
< 0.1%
1041341
< 0.1%
1044321
< 0.1%
1045671
< 0.1%
1046721
< 0.1%
ValueCountFrequency (%)
11757301
< 0.1%
11720701
< 0.1%
11720071
< 0.1%
11714351
< 0.1%
11687991
< 0.1%
11685161
< 0.1%
11663561
< 0.1%
11661901
< 0.1%
11650031
< 0.1%
11640501
< 0.1%

sold
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 11673
Missing (%): 63.7%
Memory size: 35.9 KiB

Value | Count | Frequency (%)
True | 5507 | 30.0%
False | 1153 | 6.3%
(Missing) | 11673 | 63.7%

first_date
Categorical

HIGH CARDINALITY
MISSING

Distinct214
Distinct (%)1.8%
Missing6660
Missing (%)36.3%
Memory size143.4 KiB
2019-08-27
 
75
2019-06-10
 
72
2019-07-24
 
72
2019-08-12
 
71
2019-06-21
 
71
Other values (209)
11312 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters116730
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2019-08-14
2nd row2019-12-28
3rd row2019-11-05
4th row2019-08-26
5th row2019-08-07

Common Values

ValueCountFrequency (%)
2019-08-2775
 
0.4%
2019-06-1072
 
0.4%
2019-07-2472
 
0.4%
2019-08-1271
 
0.4%
2019-06-2171
 
0.4%
2019-12-0470
 
0.4%
2019-08-2069
 
0.4%
2019-06-3069
 
0.4%
2019-12-2568
 
0.4%
2019-06-1567
 
0.4%
Other values (204)10969
59.8%
(Missing)6660
36.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2019-08-2775
 
0.6%
2019-07-2472
 
0.6%
2019-06-1072
 
0.6%
2019-08-1271
 
0.6%
2019-06-2171
 
0.6%
2019-12-0470
 
0.6%
2019-08-2069
 
0.6%
2019-06-3069
 
0.6%
2019-12-2568
 
0.6%
2019-06-1567
 
0.6%
Other values (204)10969
94.0%

Most occurring characters

ValueCountFrequency (%)
024711
21.2%
-23346
20.0%
123340
20.0%
218332
15.7%
914405
12.3%
82931
 
2.5%
62832
 
2.4%
72797
 
2.4%
31752
 
1.5%
51155
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number93384
80.0%
Dash Punctuation23346
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
024711
26.5%
123340
25.0%
218332
19.6%
914405
15.4%
82931
 
3.1%
62832
 
3.0%
72797
 
3.0%
31752
 
1.9%
51155
 
1.2%
41129
 
1.2%
Dash Punctuation
ValueCountFrequency (%)
-23346
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common116730
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
024711
21.2%
-23346
20.0%
123340
20.0%
218332
15.7%
914405
12.3%
82931
 
2.5%
62832
 
2.4%
72797
 
2.4%
31752
 
1.5%
51155
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII116730
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
024711
21.2%
-23346
20.0%
123340
20.0%
218332
15.7%
914405
12.3%
82931
 
2.5%
62832
 
2.4%
72797
 
2.4%
31752
 
1.5%
51155
 
1.0%

diff_price
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct16919
Distinct (%)92.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-51973.45732
Minimum-178663
Maximum10314
Zeros0
Zeros (%)0.0%
Negative18294
Negative (%)99.8%
Memory size143.4 KiB

Quantile statistics

Minimum-178663
5-th percentile-129140.8
Q1-76580
median-42721
Q3-20134
95-th percentile-6497.8
Maximum10314
Range188977
Interquartile range (IQR)56446

Descriptive statistics

Standard deviation38664.47826
Coefficient of variation (CV)-0.7439273864
Kurtosis-0.09974256639
Mean-51973.45732
Median Absolute Deviation (MAD)25644
Skewness-0.842080187
Sum-952829393
Variance1494941879
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-349814
 
< 0.1%
-133294
 
< 0.1%
-379774
 
< 0.1%
-474134
 
< 0.1%
-212484
 
< 0.1%
-175504
 
< 0.1%
-205864
 
< 0.1%
-133254
 
< 0.1%
-67133
 
< 0.1%
-473913
 
< 0.1%
Other values (16909)18295
99.8%
ValueCountFrequency (%)
-1786631
< 0.1%
-1785261
< 0.1%
-1766321
< 0.1%
-1755681
< 0.1%
-1754601
< 0.1%
-1753631
< 0.1%
-1750821
< 0.1%
-1750351
< 0.1%
-1750031
< 0.1%
-1741771
< 0.1%
ValueCountFrequency (%)
103141
< 0.1%
86111
< 0.1%
65031
< 0.1%
56981
< 0.1%
45851
< 0.1%
37881
< 0.1%
28071
< 0.1%
27801
< 0.1%
27431
< 0.1%
25801
< 0.1%

diff_price_perc
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct18329
Distinct (%)> 99.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.09366516909
Minimum-0.179993965
Maximum0.08004052685
Zeros0
Zeros (%)0.0%
Negative18294
Negative (%)99.8%
Memory size143.4 KiB

Quantile statistics

Minimum-0.179993965
5-th percentile-0.1694551493
Q1-0.1355795484
median-0.09432077725
Q3-0.05135914869
95-th percentile-0.01727285364
Maximum0.08004052685
Range0.2600344918
Interquartile range (IQR)0.08422039967

Descriptive statistics

Standard deviation0.04900667101
Coefficient of variation (CV)-0.5232112586
Kurtosis-1.163080009
Mean-0.09366516909
Median Absolute Deviation (MAD)0.04207049309
Skewness0.03080984319
Sum-1717.163545
Variance0.002401653803
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-0.12841441372
 
< 0.1%
-0.14705915582
 
< 0.1%
-0.096724080632
 
< 0.1%
-0.094215327282
 
< 0.1%
-0.095743035871
 
< 0.1%
-0.11365588621
 
< 0.1%
-0.1652558781
 
< 0.1%
-0.078082134411
 
< 0.1%
-0.068091197921
 
< 0.1%
-0.11108046141
 
< 0.1%
Other values (18319)18319
99.9%
ValueCountFrequency (%)
-0.1799939651
< 0.1%
-0.17998644961
< 0.1%
-0.17997373481
< 0.1%
-0.17996756711
< 0.1%
-0.17993977211
< 0.1%
-0.17992603631
< 0.1%
-0.17992348881
< 0.1%
-0.17991076241
< 0.1%
-0.17988686011
< 0.1%
-0.17987594631
< 0.1%
ValueCountFrequency (%)
0.080040526851
< 0.1%
0.051101906541
< 0.1%
0.030474167751
< 0.1%
0.02919521031
< 0.1%
0.022903990011
< 0.1%
0.022716913081
< 0.1%
0.017264036491
< 0.1%
0.016876498331
< 0.1%
0.01338250071
< 0.1%
0.012939177931
< 0.1%

car_brand
Categorical

Distinct47
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size143.4 KiB
Nissan
1303 
Mercedes Benz
1279 
Chevrolet
 
1222
Ford
 
1173
Volkswagen
 
1082
Other values (42)
12274 

Length

Max length13
Median length6
Mean length6.47182676
Min length3

Characters and Unicode

Total characters118648
Distinct characters43
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowToyota
2nd rowToyota
3rd rowToyota
4th rowToyota
5th rowToyota

Common Values

ValueCountFrequency (%)
Nissan1303
 
7.1%
Mercedes Benz1279
 
7.0%
Chevrolet1222
 
6.7%
Ford1173
 
6.4%
Volkswagen1082
 
5.9%
Audi1040
 
5.7%
Dodge1038
 
5.7%
BMW909
 
5.0%
Honda890
 
4.9%
Toyota856
 
4.7%
Other values (37)7541
41.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
nissan1303
 
6.5%
benz1279
 
6.4%
mercedes1279
 
6.4%
chevrolet1222
 
6.1%
ford1173
 
5.8%
volkswagen1082
 
5.4%
audi1040
 
5.2%
dodge1038
 
5.2%
bmw909
 
4.5%
honda890
 
4.4%
Other values (38)8879
44.2%

Most occurring characters

ValueCountFrequency (%)
e13773
 
11.6%
o9825
 
8.3%
a8364
 
7.0%
n7068
 
6.0%
d6537
 
5.5%
s6323
 
5.3%
r5659
 
4.8%
i5631
 
4.7%
t4526
 
3.8%
u4397
 
3.7%
Other values (33)46545
39.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter93615
78.9%
Uppercase Letter23242
 
19.6%
Space Separator1791
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e13773
14.7%
o9825
10.5%
a8364
 
8.9%
n7068
 
7.6%
d6537
 
7.0%
s6323
 
6.8%
r5659
 
6.0%
i5631
 
6.0%
t4526
 
4.8%
u4397
 
4.7%
Other values (13)21512
23.0%
Uppercase Letter
ValueCountFrequency (%)
M3375
14.5%
B2498
 
10.7%
C2090
 
9.0%
A1546
 
6.7%
V1472
 
6.3%
F1327
 
5.7%
N1303
 
5.6%
H1168
 
5.0%
R1054
 
4.5%
D1038
 
4.5%
Other values (9)6371
27.4%
Space Separator
ValueCountFrequency (%)
1791
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin116857
98.5%
Common1791
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e13773
 
11.8%
o9825
 
8.4%
a8364
 
7.2%
n7068
 
6.0%
d6537
 
5.6%
s6323
 
5.4%
r5659
 
4.8%
i5631
 
4.8%
t4526
 
3.9%
u4397
 
3.8%
Other values (32)44754
38.3%
Common
ValueCountFrequency (%)
1791
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII118648
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e13773
 
11.6%
o9825
 
8.3%
a8364
 
7.0%
n7068
 
6.0%
d6537
 
5.5%
s6323
 
5.3%
r5659
 
4.8%
i5631
 
4.7%
t4526
 
3.8%
u4397
 
3.7%
Other values (33)46545
39.2%

antiquity
Real number (ℝ≥0)

ZEROS

Distinct14
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.710576556
Minimum0
Maximum13
Zeros405
Zeros (%)2.2%
Negative0
Negative (%)0.0%
Memory size71.7 KiB

Quantile statistics

Minimum0
5-th percentile1
Q13
median7
Q310
95-th percentile13
Maximum13
Range13
Interquartile range (IQR)7

Descriptive statistics

Standard deviation3.848641332
Coefficient of variation (CV)0.5735187283
Kurtosis-1.190260072
Mean6.710576556
Median Absolute Deviation (MAD)3
Skewness0.02196872808
Sum123025
Variance14.81204011
MonotonicityNot monotonic
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
9 | 1694 | 9.2%
1 | 1596 | 8.7%
3 | 1537 | 8.4%
8 | 1464 | 8.0%
4 | 1442 | 7.9%
13 | 1369 | 7.5%
5 | 1353 | 7.4%
12 | 1333 | 7.3%
7 | 1316 | 7.2%
10 | 1311 | 7.2%
Other values (4) | 3918 | 21.4%

Value | Count | Frequency (%)
0 | 405 | 2.2%
1 | 1596 | 8.7%
2 | 1284 | 7.0%
3 | 1537 | 8.4%
4 | 1442 | 7.9%
5 | 1353 | 7.4%
6 | 1175 | 6.4%
7 | 1316 | 7.2%
8 | 1464 | 8.0%
9 | 1694 | 9.2%

Value | Count | Frequency (%)
13 | 1369 | 7.5%
12 | 1333 | 7.3%
11 | 1054 | 5.7%
10 | 1311 | 7.2%
9 | 1694 | 9.2%
8 | 1464 | 8.0%
7 | 1316 | 7.2%
6 | 1175 | 6.4%
5 | 1353 | 7.4%
4 | 1442 | 7.9%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the slope, that is, the angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
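
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the toy values are assumptions made for the example and are not taken from the dataset.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Illustrative toy data (hypothetical, not from the profiled dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r_original = pearson_r(xs, ys)
# Shifting and positively rescaling either variable leaves r unchanged.
r_transformed = pearson_r([10 + 3 * x for x in xs], [-5 + 0.5 * y for y in ys])
print(r_original, r_transformed)
```

The second call shows the invariance under changes in location and scale: both printed values agree up to floating-point rounding.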

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
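
The pair-counting definition above can be sketched directly. This is the simple form sometimes called tau-a, written as an O(n²) loop for clarity; the function name `kendall_tau` is an assumption for the example. Note that libraries such as SciPy default to the tie-adjusted tau-b variant, so results can differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts as neither here
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data: two concordant pairs, one discordant pair.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```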

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
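
For orientation, the bias correction is usually stated as follows: subtract (k-1)(r-1)/(n-1) from φ² = χ²/n (clipping at 0), shrink the row and column counts r and k in the same spirit, and take the square root of the ratio. The sketch below follows that commonly cited form. It is an assumption that the report's own implementation matches it in every detail, and the function name and contingency tables are hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is positive and that the table has
    at least two rows and two columns.
    """
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical tables: one with independent variables, one with perfect association.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

The clipping at 0 is what lets the corrected estimate report exact independence, which the uncorrected estimator almost never does on finite samples.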

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
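
The nullity correlation shown in the heatmap can be approximated by hand: replace each value with a 0/1 "is missing" flag and correlate the flags. The sketch below does this for the `sold` and `first_date` values of the ten first rows shown in the Sample section, with `None` standing in for `<NA>` and `NaN`. The function name `nullity_correlation` is an assumption for the example, and the report's plotting library may compute the figure differently in detail.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    # Undefined (division by zero) if a column is entirely missing or entirely present.
    return cov / (sd_a * sd_b)

# `sold` and `first_date` from the ten first rows of the sample; None marks a missing cell.
sold = [None, None, None, True, None, None, False, None, True, None]
first_date = ["2019-08-14", "2019-12-28", "2019-11-05", None, "2019-08-26",
              "2019-08-07", None, "2019-11-26", None, "2019-08-11"]

print(nullity_correlation(sold, first_date))
```

In these ten rows each record has exactly one of the two fields filled in, so the flags are perfectly anti-correlated and the result is -1 up to rounding.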

Sample

First rows

(index)  car_id  car_type  color  km        average_price  transmission  body_type  DI    price     sold   first_date  diff_price  diff_price_perc  car_brand  antiquity
0        0       Supra     Blue   25848512  194799.0       Manual        Sedan      -1.0  228071.0  <NA>   2019-08-14  -33272.0    -0.170802        Toyota     1
1        112     Supra     Black  25913421  194799.0       Manual        Sedan      -1.0  208220.0  <NA>   2019-12-28  -13421.0    -0.068897        Toyota     1
2        1497    Supra     Black  24069281  194799.0       Manual        Sedan      -1.0  197099.0  <NA>   2019-11-05  -2300.0     -0.011807        Toyota     1
3        1891    Supra     Blue   21828354  194799.0       Manual        Sedan      27.0  211287.0  True   NaN         -16488.0    -0.084641        Toyota     1
4        1955    Supra     Black  25620849  194799.0       Manual        Sedan      -1.0  213144.0  <NA>   2019-08-26  -18345.0    -0.094174        Toyota     1
5        2435    Supra     Blue   19606492  194799.0       Manual        Sedan      -1.0  204643.0  <NA>   2019-08-07  -9844.0     -0.050534        Toyota     1
6        2645    Supra     White  21322640  194799.0       Manual        Sedan      12.0  216338.0  False  NaN         -21539.0    -0.110570        Toyota     1
7        3640    Supra     Black  27858456  194799.0       Manual        Sedan      -1.0  227949.0  <NA>   2019-11-26  -33150.0    -0.170175        Toyota     1
8        3743    Supra     Red    20518362  194799.0       Manual        Sedan      13.0  212860.0  True   NaN         -18061.0    -0.092716        Toyota     1
9        4405    Supra     White  24583516  194799.0       Manual        Sedan      -1.0  202939.0  <NA>   2019-08-11  -8140.0     -0.041787        Toyota     1

Last rows

(index)  car_id  car_type  color  km        average_price  transmission  body_type  DI    price     sold   first_date  diff_price  diff_price_perc  car_brand  antiquity
18323    14019   S80       Blue   12582104  142081.0       Automatic     SUV        -1.0  161964.0  <NA>   2019-09-13  -19883.0    -0.139941        Volvo      7
18324    15396   S80       Red    18468745  142081.0       Automatic     SUV        5.0   159724.0  False  NaN         -17643.0    -0.124176        Volvo      7
18325    15629   S80       Blue   15260331  142081.0       Automatic     SUV        8.0   152244.0  False  NaN         -10163.0    -0.071530        Volvo      7
18326    15855   S80       White  14868440  142081.0       Automatic     SUV        -1.0  159666.0  <NA>   2019-12-18  -17585.0    -0.123767        Volvo      7
18327    16169   S80       Black  14111995  142081.0       Automatic     SUV        -1.0  157873.0  <NA>   2019-10-26  -15792.0    -0.111148        Volvo      7
18328    16468   S80       Black  14232987  142081.0       Automatic     SUV        -1.0  150419.0  <NA>   2019-06-16  -8338.0     -0.058685        Volvo      7
18329    16971   S80       Black  18144875  142081.0       Automatic     SUV        -1.0  157175.0  <NA>   2019-11-03  -15094.0    -0.106235        Volvo      7
18330    17013   S80       Red    16447711  142081.0       Automatic     SUV        -1.0  144139.0  <NA>   2019-08-12  -2058.0     -0.014485        Volvo      7
18331    18235   S80       Red    11665859  142081.0       Automatic     SUV        5.0   157358.0  True   NaN         -15277.0    -0.107523        Volvo      7
18332    18543   S80       Black  18127179  142081.0       Automatic     SUV        -1.0  159423.0  <NA>   2019-06-22  -17342.0    -0.122057        Volvo      7